Assignment

A short description of the post.

Yining Bai true
07-13-2021

Introduction

Many of the Abila, Kronos-based employees of GAStech have company cars which are approved for both personal and business use. Those who do not have company cars have the ability to check out company trucks for business use, but these trucks cannot be used for personal business.

Employees with company cars are happy to have these vehicles, because the company cars are generally much higher quality than the cars they would be able to afford otherwise. However, GAStech does not trust their employees. Without the employees? knowledge, GAStech has installed geospatial tracking software in the company vehicles. The vehicles are tracked periodically as long as they are moving.

This vehicle tracking data has been made available to law enforcement to support their investigation. Unfortunately, data is not available for the day the GAStech employees went missing. Data is only available for the two weeks prior to the disappearance.

To promote local businesses, Kronos based companies provide a Kronos Kares benefit card to GASTech employees giving them discounts and rewards in exchange for collecting information about their credit card purchases and preferences as recorded on loyalty cards. This data has been made available to investigators in the hopes that it can help resolve the situation. However, Kronos Kares does not collect personal information beyond purchases.

Question1

In this question, we need to find out the popular locations. Dataset cc_data and loyalty_data will be used in this question.

Firstly, a loop structure is created to library all the packages needed.

packages = c('igraph', 'tidygraph', 'ggraph', 'visNetwork', 'lubridate', 'clock', 'tidyverse','dplyr', 'tidyr','raster','sf','sp','tmap', 'gifski', 'writexl', 'mapview', "ggplot2", 'dplyr')
for (p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p,character.only = T)
}

Load the credit card dataset in the environment. We can also use the function “glimpse” to check the data type of the variables.

credit_card <- read.csv("data_MC2/cc_data.csv")
glimpse(credit_card)
Rows: 1,490
Columns: 4
$ timestamp  <chr> "1/6/2014 7:28", "1/6/2014 7:34", "1/6/2014 7:35"~
$ location   <chr> "Brew've Been Served", "Hallowed Grounds", "Brew'~
$ price      <dbl> 11.34, 52.22, 8.33, 16.72, 4.24, 4.17, 28.73, 9.6~
$ last4ccnum <int> 4795, 7108, 6816, 9617, 7384, 5368, 7253, 4948, 9~

Load the loyalty card dataset in the environment.

loyalty_card <- read.csv("data_MC2/loyalty_data.csv")
glimpse(loyalty_card)
Rows: 1,392
Columns: 4
$ timestamp  <chr> "1/6/2014", "1/6/2014", "1/6/2014", "1/6/2014", "~
$ location   <chr> "Brew've Been Served", "Brew've Been Served", "Ha~
$ price      <dbl> 4.17, 9.60, 16.53, 11.51, 12.93, 4.27, 11.20, 15.~
$ loyaltynum <chr> "L2247", "L9406", "L8328", "L6417", "L1107", "L40~

Change the datatype of variable timestamp from character to date-time format.

credit_card$timestamp <- date_time_parse(credit_card$timestamp,
                zone = "",
                format = "%m/%d/%Y %H:%M")
glimpse(credit_card)
Rows: 1,490
Columns: 4
$ timestamp  <dttm> 2014-01-06 07:28:00, 2014-01-06 07:34:00, 2014-0~
$ location   <chr> "Brew've Been Served", "Hallowed Grounds", "Brew'~
$ price      <dbl> 11.34, 52.22, 8.33, 16.72, 4.24, 4.17, 28.73, 9.6~
$ last4ccnum <int> 4795, 7108, 6816, 9617, 7384, 5368, 7253, 4948, 9~

Change the datatype of variable timestamp from character to date format.

loyalty_card$timestamp <- date_time_parse(loyalty_card$timestamp,
                zone = "",
                format = "%m/%d/%Y")
glimpse(loyalty_card)
Rows: 1,392
Columns: 4
$ timestamp  <dttm> 2014-01-06, 2014-01-06, 2014-01-06, 2014-01-06, ~
$ location   <chr> "Brew've Been Served", "Brew've Been Served", "Ha~
$ price      <dbl> 4.17, 9.60, 16.53, 11.51, 12.93, 4.27, 11.20, 15.~
$ loyaltynum <chr> "L2247", "L9406", "L8328", "L6417", "L1107", "L40~

In order to explore the further corresponding relationship between credit card and loyalty card, we need to join the two datatables together. However, considering that there is not a common column for joining, a new column “Date” is needed to be extracted from the variable timestamp in credit_card.csv.

credit_card$Date <- format(credit_card$timestamp, format="%Y-%m-%d")
credit_card$Date <- date_time_parse(credit_card$Date,
                           zone = "",
                           format = "%Y-%m-%d")
glimpse(credit_card)
Rows: 1,490
Columns: 5
$ timestamp  <dttm> 2014-01-06 07:28:00, 2014-01-06 07:34:00, 2014-0~
$ location   <chr> "Brew've Been Served", "Hallowed Grounds", "Brew'~
$ price      <dbl> 11.34, 52.22, 8.33, 16.72, 4.24, 4.17, 28.73, 9.6~
$ last4ccnum <int> 4795, 7108, 6816, 9617, 7384, 5368, 7253, 4948, 9~
$ Date       <dttm> 2014-01-06, 2014-01-06, 2014-01-06, 2014-01-06, ~

Then, we can use variable “Date”, “location”, and “price” to finish the full join of the two tables. According to the regulation of Gas Tech, the records of loyalty card are usually along with the records of credit card. Therefore, we can find out the combination of credit card and loyalty card after joining.

card_joined <- credit_card %>%
  full_join(loyalty_card, by = c("Date" = "timestamp", "location", "price"))

In order to find out the popular locations, we need to the calculate the number of records of locations and then choose those with more records. Popular locations are explored for credit cards and loyalty cards separately.

popular_credit_card <- credit_card %>%
  group_by(location) %>%
  summarise(count = n()) %>%
  arrange(desc(count))
popular_loyalty_card <- loyalty_card %>%
  group_by(location) %>%
  summarise(count = n()) %>%
  arrange(desc(count))

According to the datatable, we will consider the Top6 as the popular locations.

popular_top_credit <- popular_credit_card %>%
 
 gather(location, count) %>%
 arrange(desc(count)) %>%
 top_n(6)

popular_top_credit
# A tibble: 6 x 2
  location            count
  <chr>               <int>
1 Katerina's Cafe       212
2 Hippokampos           171
3 Guy's Gyros           158
4 Brew've Been Served   156
5 Hallowed Grounds       92
6 Ouzeri Elian           87

Then we can draw the bar chart of the popular locations selected.

top6_credit<-ggplot(data=popular_top_credit, aes(x=location, y=count)) +
  geom_bar(stat="identity", fill="steelblue")+
  theme_minimal()
top6_credit

popular_top_loyalty <- popular_loyalty_card %>%
 
 gather(location, count) %>%
 arrange(desc(count)) %>%
 top_n(6)

popular_top_loyalty
# A tibble: 6 x 2
  location            count
  <chr>               <int>
1 Katerina's Cafe       195
2 Hippokampos           155
3 Guy's Gyros           146
4 Brew've Been Served   140
5 Ouzeri Elian           84
6 Hallowed Grounds       80
top6_loyalty<-ggplot(data=popular_top_loyalty, aes(x=location, y=count)) +
  geom_bar(stat="identity", fill="steelblue")+
  theme_minimal()
top6_loyalty

Therefore, combining the result of loyalty card and credit card, the popular locations are the 6 locations shown before. And we can also create a new datatable only containing the card information of popular locations.

popular_locations <- card_joined %>%
  filter(location %in% c("Katerina's Cafe", "Hippokampos", "Guy's Gyros", "Brew've Been Served", "Ouzeri Elian", "Hallowed Grounds")) %>%
  drop_na(timestamp) %>%
  dplyr::select(-Date)

The peak time in the Katerina’s Cafe is 13pm - 14pm and 19pm - 20pm, which are the lunch time and dinner time respectively. Similarly, Hippokampos, Guy’s Gyros and Ouzeri Elian are all restaurants and they have the same peak hours as Katerina’s Cafe.

popular_locations %>%
  filter(location == "Katerina's Cafe")
              timestamp        location price last4ccnum loyaltynum
1   2014-01-06 12:56:00 Katerina's Cafe 13.68       8332       <NA>
2   2014-01-06 13:28:00 Katerina's Cafe 33.54       8411      L6110
3   2014-01-06 13:46:00 Katerina's Cafe 18.56       6899      L6267
4   2014-01-06 13:50:00 Katerina's Cafe 13.59       1874      L4424
5   2014-01-06 13:50:00 Katerina's Cafe 32.64       9617      L5553
6   2014-01-06 13:53:00 Katerina's Cafe 36.95       2142       <NA>
7   2014-01-06 13:59:00 Katerina's Cafe 39.41       3853      L1485
8   2014-01-06 14:17:00 Katerina's Cafe 19.65       6901      L9363
9   2014-01-06 18:56:00 Katerina's Cafe 78.65       8411       <NA>
10  2014-01-06 19:08:00 Katerina's Cafe 26.46       8129      L8328
11  2014-01-06 20:07:00 Katerina's Cafe 23.63       7253      L1682
12  2014-01-06 20:08:00 Katerina's Cafe 12.32       1874      L4424
13  2014-01-06 20:19:00 Katerina's Cafe 21.66       2418       <NA>
14  2014-01-06 20:29:00 Katerina's Cafe 17.92       9405      L3259
15  2014-01-06 20:30:00 Katerina's Cafe 28.00       6691      L6267
16  2014-01-06 20:33:00 Katerina's Cafe 15.52       5921      L3295
17  2014-01-06 20:59:00 Katerina's Cafe 26.10       3492      L7814
18  2014-01-06 21:03:00 Katerina's Cafe 10.72       9617       <NA>
19  2014-01-06 21:12:00 Katerina's Cafe 26.51       4948      L9406
20  2014-01-07 13:08:00 Katerina's Cafe 59.07       7792       <NA>
21  2014-01-07 13:33:00 Katerina's Cafe 22.27       7889      L6119
22  2014-01-07 13:41:00 Katerina's Cafe 51.41       3484       <NA>
23  2014-01-07 13:57:00 Katerina's Cafe 37.44       8202      L2343
24  2014-01-07 14:07:00 Katerina's Cafe 10.90       4434      L2169
25  2014-01-07 20:09:00 Katerina's Cafe 19.49       7253      L1682
26  2014-01-07 20:11:00 Katerina's Cafe 10.07       2142      L9637
27  2014-01-07 20:12:00 Katerina's Cafe 55.63       1874       <NA>
28  2014-01-07 20:13:00 Katerina's Cafe 94.75       7889       <NA>
29  2014-01-07 20:13:00 Katerina's Cafe 65.02       8202       <NA>
30  2014-01-07 20:18:00 Katerina's Cafe 31.60       8411      L6110
31  2014-01-07 20:32:00 Katerina's Cafe 19.53       5921      L3295
32  2014-01-07 20:33:00 Katerina's Cafe 17.26       9617      L5553
33  2014-01-07 20:33:00 Katerina's Cafe 25.54       6691      L6267
34  2014-01-07 20:47:00 Katerina's Cafe 32.83       3492      L7814
35  2014-01-07 21:01:00 Katerina's Cafe 33.29       9405      L3259
36  2014-01-07 21:19:00 Katerina's Cafe 18.35       1310      L8012
37  2014-01-08 13:19:00 Katerina's Cafe 29.26       7792      L5756
38  2014-01-08 13:28:00 Katerina's Cafe  8.53       6899      L6267
39  2014-01-08 13:34:00 Katerina's Cafe 88.98       1310       <NA>
40  2014-01-08 13:41:00 Katerina's Cafe 26.45       2142      L9637
41  2014-01-08 13:42:00 Katerina's Cafe 57.04       6895       <NA>
42  2014-01-08 13:47:00 Katerina's Cafe 27.05       1877       <NA>
43  2014-01-08 14:06:00 Katerina's Cafe 22.94       7108       <NA>
44  2014-01-08 19:32:00 Katerina's Cafe 34.25       5407      L4034
45  2014-01-08 20:00:00 Katerina's Cafe 13.83       8411      L6110
46  2014-01-08 20:08:00 Katerina's Cafe 31.59       1310      L8012
47  2014-01-08 20:27:00 Katerina's Cafe 33.31       2142      L9637
48  2014-01-08 20:31:00 Katerina's Cafe 36.60       7889      L6119
49  2014-01-08 20:35:00 Katerina's Cafe 16.93       8202      L2343
50  2014-01-08 20:42:00 Katerina's Cafe 92.83       5921       <NA>
51  2014-01-08 21:02:00 Katerina's Cafe 28.86       4948       <NA>
52  2014-01-08 21:09:00 Katerina's Cafe 27.87       1874      L4424
53  2014-01-09 13:03:00 Katerina's Cafe 17.02       7253      L1682
54  2014-01-09 13:23:00 Katerina's Cafe 16.98       7792      L5756
55  2014-01-09 13:27:00 Katerina's Cafe 25.85       8156      L5224
56  2014-01-09 13:35:00 Katerina's Cafe 10.42       3853      L1485
57  2014-01-09 14:06:00 Katerina's Cafe  8.01       1877      L3014
58  2014-01-09 19:25:00 Katerina's Cafe 35.92       8129      L8328
59  2014-01-09 19:30:00 Katerina's Cafe 26.60       5921      L9406
60  2014-01-09 19:30:00 Katerina's Cafe 26.60       5921      L3295
61  2014-01-09 20:06:00 Katerina's Cafe 26.60       4948      L9406
62  2014-01-09 20:06:00 Katerina's Cafe 26.60       4948      L3295
63  2014-01-09 20:08:00 Katerina's Cafe 19.02       3492      L7814
64  2014-01-09 20:09:00 Katerina's Cafe 90.97       6691       <NA>
65  2014-01-09 20:20:00 Katerina's Cafe 91.14       9405       <NA>
66  2014-01-09 20:22:00 Katerina's Cafe 29.82       8202      L2343
67  2014-01-09 20:27:00 Katerina's Cafe 53.78       5407       <NA>
68  2014-01-09 20:38:00 Katerina's Cafe 30.69       2418       <NA>
69  2014-01-09 21:18:00 Katerina's Cafe  8.57       1310      L8012
70  2014-01-10 13:30:00 Katerina's Cafe 34.76       7688      L4164
71  2014-01-10 13:36:00 Katerina's Cafe 25.36       7354      L9254
72  2014-01-10 13:38:00 Katerina's Cafe 21.22       2418      L9018
73  2014-01-10 13:47:00 Katerina's Cafe 25.62       6895      L3366
74  2014-01-10 13:47:00 Katerina's Cafe 24.33       2142      L9637
75  2014-01-10 14:08:00 Katerina's Cafe 77.62       7108       <NA>
76  2014-01-10 19:50:00 Katerina's Cafe 19.57       8129       <NA>
77  2014-01-10 19:56:00 Katerina's Cafe 21.89       5921      L3295
78  2014-01-10 20:08:00 Katerina's Cafe 15.76       9405      L3259
79  2014-01-10 20:09:00 Katerina's Cafe  8.14       8411      L6110
80  2014-01-10 20:42:00 Katerina's Cafe 21.28       5407      L4034
81  2014-01-11 13:19:00 Katerina's Cafe 45.22       8156      L5224
82  2014-01-11 13:23:00 Katerina's Cafe 45.21       2418      L9018
83  2014-01-11 13:32:00 Katerina's Cafe 29.10       6895      L3366
84  2014-01-11 13:43:00 Katerina's Cafe 71.65       3484      L2490
85  2014-01-11 13:50:00 Katerina's Cafe 23.45       1321      L4149
86  2014-01-11 14:17:00 Katerina's Cafe 26.02       2142      L9637
87  2014-01-11 19:08:00 Katerina's Cafe 55.67       9617      L5553
88  2014-01-11 19:31:00 Katerina's Cafe 33.18       2142      L9637
89  2014-01-11 19:32:00 Katerina's Cafe 20.71       6691      L6267
90  2014-01-11 19:40:00 Katerina's Cafe 57.36       2540      L5947
91  2014-01-11 19:50:00 Katerina's Cafe 12.55       7889      L6119
92  2014-01-11 19:51:00 Katerina's Cafe 45.68       1310      L8012
93  2014-01-11 19:52:00 Katerina's Cafe 15.72       8411       <NA>
94  2014-01-11 20:00:00 Katerina's Cafe 25.76       4948      L9406
95  2014-01-11 20:06:00 Katerina's Cafe 52.45       8202      L2343
96  2014-01-11 20:13:00 Katerina's Cafe  9.78       9405      L3259
97  2014-01-11 20:24:00 Katerina's Cafe 94.68       2418      L9018
98  2014-01-11 20:46:00 Katerina's Cafe 57.60       1874       <NA>
99  2014-01-12 19:18:00 Katerina's Cafe 12.64       7889      L6119
100 2014-01-12 19:21:00 Katerina's Cafe 11.81       7253      L1682
101 2014-01-12 19:43:00 Katerina's Cafe 40.94       5407      L4034
102 2014-01-12 19:47:00 Katerina's Cafe 17.36       3492       <NA>
103 2014-01-12 20:04:00 Katerina's Cafe 39.30       8129      L8328
104 2014-01-12 20:11:00 Katerina's Cafe 67.14       3547      L9362
105 2014-01-12 20:12:00 Katerina's Cafe 15.37       9405      L3259
106 2014-01-12 20:13:00 Katerina's Cafe 39.55       2142       <NA>
107 2014-01-12 20:17:00 Katerina's Cafe 16.34       8411      L6110
108 2014-01-12 20:35:00 Katerina's Cafe 40.25       4948      L9406
109 2014-01-12 20:48:00 Katerina's Cafe 87.09       1874       <NA>
110 2014-01-13 13:23:00 Katerina's Cafe 38.98       2418      L9018
111 2014-01-13 13:34:00 Katerina's Cafe 26.15       6899      L6267
112 2014-01-13 13:36:00 Katerina's Cafe 17.21       9405       <NA>
113 2014-01-13 13:40:00 Katerina's Cafe  8.29       1874      L4424
114 2014-01-13 13:48:00 Katerina's Cafe 24.26       3492      L7814
115 2014-01-13 13:52:00 Katerina's Cafe 29.55       3547      L9362
116 2014-01-13 19:50:00 Katerina's Cafe 89.83       3547       <NA>
117 2014-01-13 19:59:00 Katerina's Cafe 30.84       6691       <NA>
118 2014-01-13 20:24:00 Katerina's Cafe 35.70       2142      L9637
119 2014-01-13 20:33:00 Katerina's Cafe 91.36       1310       <NA>
120 2014-01-13 20:44:00 Katerina's Cafe 93.57       5407       <NA>
121 2014-01-13 20:47:00 Katerina's Cafe 22.24       7253       <NA>
122 2014-01-13 20:51:00 Katerina's Cafe 55.82       7889       <NA>
123 2014-01-13 21:00:00 Katerina's Cafe 27.59       1874      L4424
124 2014-01-13 21:16:00 Katerina's Cafe 33.11       9405      L3259
125 2014-01-14 13:14:00 Katerina's Cafe 29.12       8156      L5224
126 2014-01-14 13:21:00 Katerina's Cafe 34.45       7792       <NA>
127 2014-01-14 13:22:00 Katerina's Cafe 10.03       6895      L3366
128 2014-01-14 13:32:00 Katerina's Cafe 20.95       7354      L9254
129 2014-01-14 13:34:00 Katerina's Cafe 29.10       4795      L8566
130 2014-01-14 13:41:00 Katerina's Cafe 75.46       3547       <NA>
131 2014-01-14 13:43:00 Katerina's Cafe 31.69       6691      L6267
132 2014-01-14 13:58:00 Katerina's Cafe 33.54       7819      L5259
133 2014-01-14 20:09:00 Katerina's Cafe 91.80       7889       <NA>
134 2014-01-14 20:15:00 Katerina's Cafe 27.67       8129       <NA>
135 2014-01-14 20:17:00 Katerina's Cafe 36.95       3547      L9362
136 2014-01-14 20:19:00 Katerina's Cafe 31.94       5407      L4034
137 2014-01-14 20:26:00 Katerina's Cafe 12.65       1874      L4424
138 2014-01-14 20:34:00 Katerina's Cafe 11.89       3492      L7814
139 2014-01-14 20:42:00 Katerina's Cafe 46.61       8202       <NA>
140 2014-01-14 20:50:00 Katerina's Cafe 25.40       8411      L6110
141 2014-01-14 21:10:00 Katerina's Cafe 18.35       9617      L5553
142 2014-01-15 13:13:00 Katerina's Cafe 24.15       7792      L5756
143 2014-01-15 13:17:00 Katerina's Cafe 88.56       2418       <NA>
144 2014-01-15 13:27:00 Katerina's Cafe 55.60       1286      L3288
145 2014-01-15 14:00:00 Katerina's Cafe 12.12       1310      L8012
146 2014-01-15 19:50:00 Katerina's Cafe 39.23       3492      L7814
147 2014-01-15 19:59:00 Katerina's Cafe 25.08       8129       <NA>
148 2014-01-15 20:00:00 Katerina's Cafe 29.84       2142       <NA>
149 2014-01-15 20:05:00 Katerina's Cafe 11.45       9617      L5553
150 2014-01-15 20:17:00 Katerina's Cafe 29.33       8411      L6110
151 2014-01-15 20:23:00 Katerina's Cafe 39.63       1310      L8012
152 2014-01-15 20:26:00 Katerina's Cafe 61.61       8202       <NA>
153 2014-01-15 20:29:00 Katerina's Cafe 29.66       2418      L9018
154 2014-01-15 21:05:00 Katerina's Cafe 13.77       7253      L1682
155 2014-01-15 21:21:00 Katerina's Cafe 27.48       3547      L9362
156 2014-01-16 13:17:00 Katerina's Cafe 11.08       9683      L7291
157 2014-01-16 13:19:00 Katerina's Cafe 27.82       7688      L4164
158 2014-01-16 13:23:00 Katerina's Cafe 34.90       7792      L5756
159 2014-01-16 13:27:00 Katerina's Cafe 34.02       3484      L2490
160 2014-01-16 13:27:00 Katerina's Cafe 39.74       5407      L4034
161 2014-01-16 13:43:00 Katerina's Cafe 38.32       5368      L2247
162 2014-01-16 18:49:00 Katerina's Cafe 93.68       8411       <NA>
163 2014-01-16 19:39:00 Katerina's Cafe 10.93       7889      L6119
164 2014-01-16 19:47:00 Katerina's Cafe 34.34       3547       <NA>
165 2014-01-16 20:05:00 Katerina's Cafe 69.30       5407       <NA>
166 2014-01-16 20:09:00 Katerina's Cafe 18.79       8129      L8328
167 2014-01-16 20:23:00 Katerina's Cafe 13.91       9617      L5553
168 2014-01-16 20:25:00 Katerina's Cafe 35.43       2418      L9018
169 2014-01-16 20:28:00 Katerina's Cafe 30.84       9405       <NA>
170 2014-01-16 20:36:00 Katerina's Cafe 19.44       2142      L9637
171 2014-01-16 20:39:00 Katerina's Cafe 30.56       8202      L2343
172 2014-01-16 20:51:00 Katerina's Cafe 34.58       3492      L7814
173 2014-01-17 13:42:00 Katerina's Cafe 21.01       3547      L9362
174 2014-01-17 13:48:00 Katerina's Cafe 13.11       1321      L4149
175 2014-01-17 13:53:00 Katerina's Cafe 35.52       4434      L2169
176 2014-01-17 14:15:00 Katerina's Cafe 70.72       4948       <NA>
177 2014-01-17 19:24:00 Katerina's Cafe 22.18       3492      L7814
178 2014-01-17 19:57:00 Katerina's Cafe 34.93       8129       <NA>
179 2014-01-17 20:07:00 Katerina's Cafe 21.22       1310      L8012
180 2014-01-17 20:19:00 Katerina's Cafe 36.12       8202      L2343
181 2014-01-17 20:20:00 Katerina's Cafe 19.09       7253      L1682
182 2014-01-17 20:24:00 Katerina's Cafe 72.13       7889       <NA>
183 2014-01-17 20:58:00 Katerina's Cafe 38.21       2142      L9637
184 2014-01-17 21:05:00 Katerina's Cafe 33.73       5407      L4034
185 2014-01-17 21:06:00 Katerina's Cafe 27.49       9405      L3259
186 2014-01-18 13:15:00 Katerina's Cafe 34.54       6691      L6267
187 2014-01-18 13:34:00 Katerina's Cafe 19.30       3484      L2490
188 2014-01-18 14:00:00 Katerina's Cafe 37.26       8411      L6110
189 2014-01-18 14:03:00 Katerina's Cafe 96.36       7688      L4164
190 2014-01-18 14:06:00 Katerina's Cafe 24.14       7819       <NA>
191 2014-01-18 19:17:00 Katerina's Cafe 21.63       8129       <NA>
192 2014-01-18 19:30:00 Katerina's Cafe  8.14       7889      L6119
193 2014-01-18 19:31:00 Katerina's Cafe 22.74       1874      L4424
194 2014-01-18 19:32:00 Katerina's Cafe 33.41       4434      L2169
195 2014-01-18 19:36:00 Katerina's Cafe 35.79       6691       <NA>
196 2014-01-18 19:46:00 Katerina's Cafe 11.19       8202       <NA>
197 2014-01-18 19:47:00 Katerina's Cafe 16.48       4948       <NA>
198 2014-01-18 19:53:00 Katerina's Cafe 76.10       3547      L9362
199 2014-01-18 19:54:00 Katerina's Cafe 33.63       1310      L8012
200 2014-01-18 19:56:00 Katerina's Cafe 32.68       2418      L9018
201 2014-01-18 20:01:00 Katerina's Cafe 17.37       9405       <NA>
202 2014-01-18 20:04:00 Katerina's Cafe 45.88       8411      L6110
203 2014-01-18 20:09:00 Katerina's Cafe 19.31       7354      L9254
204 2014-01-18 20:13:00 Katerina's Cafe  9.72       3853      L1485
205 2014-01-18 20:15:00 Katerina's Cafe 37.21       7819      L5259
206 2014-01-19 13:00:00 Katerina's Cafe 35.01       7354      L9254
207 2014-01-19 13:23:00 Katerina's Cafe 70.22       4795      L8566
208 2014-01-19 13:53:00 Katerina's Cafe 26.76       4434      L2169
209 2014-01-19 14:01:00 Katerina's Cafe 12.78       1310       <NA>
210 2014-01-19 14:20:00 Katerina's Cafe 54.20       3492       <NA>
211 2014-01-19 18:54:00 Katerina's Cafe 72.25       3547       <NA>
212 2014-01-19 19:53:00 Katerina's Cafe 34.03       7889      L6119
213 2014-01-19 20:05:00 Katerina's Cafe 33.79       1874      L4424
214 2014-01-19 20:22:00 Katerina's Cafe 38.90       9617      L5553

Brew’ve Been Served and Hallowed Grounds are all coffee shop and the peak hours are 7am-8am.

Actually, as loyalty card can be considered as the proof of personal identity, generally, one loyalty card can only be used by one person. And credit can only be owned or used by one person unless there is kinship between two users. Therefore, one credit card can only be related to one loyalty card. Once there are several loyalty cards records under the same credit card, such credit card should be labeled as abnormal one.

abnormal_credit_card <- popular_locations %>%
  drop_na(loyaltynum) %>%
  group_by(last4ccnum) %>%
  summarize(loy_n = n_distinct(loyaltynum)) %>%
  filter(loy_n > 1)
abnormal_credit_card 
# A tibble: 7 x 2
  last4ccnum loy_n
       <int> <int>
1       1286     2
2       4795     2
3       4948     2
4       5368     2
5       5921     2
6       7889     2
7       8332     2

The abnormal credit cards are as shown before, which will be analyzed further in the following parts.

Question2

In this question, the abnormal records are required to combine with vehicle data to dig out more information. The data “car_assignment” is comparably simple which only include some basic information of employees and there is no connections between card information and car information. So the data of gps should be used in this question.

Firstly, we should load the map in the environment.

bgmap <- raster("data_MC2/MC2-tourist.tif")
bgmap
class      : RasterLayer 
band       : 1  (of  3  bands)
dimensions : 1595, 2706, 4316070  (nrow, ncol, ncell)
resolution : 3.16216e-05, 3.16216e-05  (x, y)
extent     : 24.82419, 24.90976, 36.04499, 36.09543  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs 
source     : MC2-tourist.tif 
names      : MC2.tourist 
values     : 0, 255  (min, max)

Plot raster layer.

tmap_mode("plot")
tm_shape(bgmap) +
  tm_raster(bgmap,
            legend.show = FALSE)

tm_shape(bgmap) +
tm_rgb(bgmap, r = 1,g = 2,b = 3,
       alpha = NA,
       saturation = 1,
       interpolate = TRUE,
       max.value = 255)

Import vector GIS data file.

Abila_st <- st_read(dsn = "data_MC2/Geospatial",
layer = "Abila")
Reading layer `Abila' from data source 
  `F:\Visual Analysis\yining-ai\Makeover1\_posts\2021-07-13-assignment\data_MC2\Geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 3290 features and 9 fields
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: 24.82401 ymin: 36.04502 xmax: 24.90997 ymax: 36.09492
Geodetic CRS:  WGS 84

In order to create relationship between credit card records and gps information, we need to find out the parking time and the driving time associated with each transaction. Here a new variable “Time Difference” is created to calculate the time interval.

gps2 <- read_csv("data_MC2/gps2.csv")
glimpse(gps2)
Rows: 685,169
Columns: 6
$ Timestamp         <chr> "1/6/2014 7:20", "1/6/2014 7:20", "1/6/201~
$ id                <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
$ lat               <dbl> 36.06646, 36.06634, 36.06615, 36.06613, 36~
$ long              <dbl> 24.88258, 24.88259, 24.88258, 24.88258, 24~
$ `Time Difference` <time>       NA, 00:02:00, 00:03:00, 00:01:00, 0~
$ Seconds           <dbl> 0, 2, 3, 1, 3, 1, 1, 1, 4, 1, 1, 2, 3, 1, ~

Change the datatype of variable Timestamp.

gps2$Timestamp <- date_time_parse(gps2$Timestamp,
                zone = "",
                format = "%m/%d/%Y %H:%M")
gps2$id <- as_factor(gps2$id)

To create the roadmap, longitude and latitude should be combined as a coordination point.

gps_sf <- st_as_sf(gps2, 
                   coords = c("long", "lat"),
                       crs= 4326)

To facilitate filtering, variables “day”, “hour” and “minute” can be extracted.

gps_sf$day <- format(gps_sf$Timestamp, format="%d")
gps_sf$hour <- format(gps_sf$Timestamp, format="%H")
gps_sf$minute <- format(gps_sf$Timestamp, format="%M")

Find out those stop time points whose time interval is longer than 3 minutes.

more_than_3mins <- gps_sf %>%
  filter(Seconds >180)
gps_path <- gps_sf %>%
  group_by(id, day, hour, minute) %>%
  summarize(m = mean(Timestamp), 
            do_union=FALSE) %>%
  st_cast("LINESTRING")
p = npts(gps_path, by_feature = TRUE)
gps_path2 <- cbind(gps_path, p)
gps_path2 <- gps_path2 %>%
  filter(p>1)

Draw the roadmap. We can draw movement track at any hour and minute. This can help us to show the employees’ daily lives.

gps_path_selected <- gps_path2 %>%
  filter(day == "06", hour == "08")
tmap_mode("view")
tm_shape(bgmap) +
  tm_rgb(bgmap, r = 1,g = 2,b = 3,
       alpha = NA,
       saturation = 1,
       interpolate = TRUE,
       max.value = 255) +
  tm_shape(gps_path_selected) +
  tm_lines()

In order to find those abnormal points, point graph can be drawn to help build connection between car ID and transaction location. We can firstly select the abnormal credit cards’ information.

card_selected <- card_joined %>%
  filter(last4ccnum == 1286)
gps_dot <- more_than_3mins %>%
  group_by(id, hour, day, minute) %>%
  summarize(geo_n = n_distinct(geometry)) %>%
  st_cast("POINT")

We can draw anyone’s stop point on any day at any time. Then set the abnormal time to dig out the suspicious activities.

In this way, we can find the ownerships of credit cards.

abnormal_cc <- read_csv("data_MC2/abnormal_cc.csv")
abnormal_cc
# A tibble: 9 x 6
  CC_number Loyalty_number Car_ID Name            CurrentEmploymentTy~
      <dbl> <chr>          <chr>  <chr>           <chr>               
1      1286 L3572          22     Adra Nubarron   Security            
2      1286 L3288          22     Adra Nubarron   Security            
3      4795 L8566          34     Edvard Vann     Security            
4      4948 L3295          18     Birgitta Frente Engineering         
5      5921 L9406          29     Bertrand Ovan   Facilities          
6      5921 L3295          29     Bertrand Ovan   Facilities          
7      7889 L6119          8      Lucas Alcazar   Information Technol~
8      7889 L2247          8/22/6 -               -                   
9      8332 L2070          10     Ada Campo-Corr~ Executive           
# ... with 1 more variable: CurrentEmploymentTitle <chr>

Question 3

In this question, we can use the same method to draw the dot graph to recognize all the owners of credit cards as the Questions shown before. Now we can infer all the ownerships.

total_match <- read_csv("data_MC2/total_match.csv")
total_match
# A tibble: 44 x 6
   CC_number Loyalty_number Car_ID Name           CurrentEmploymentTy~
       <dbl> <chr>           <dbl> <chr>          <chr>               
 1      9551 L5777               1 Nils Calixto   Information Technol~
 2      1415 L7783               2 Lars Azada     Engineering         
 3      9635 L3191               3 Felix Balas    Engineering         
 4      7688 L4164               4 Ingrid Barran~ Executive           
 5      6899 L6267               5 Isak Baza      Information Technol~
 6      7253 L1682               6 Linnea Bergen  Information Technol~
 7      2540 L5947               7 Elsa Orilla    Engineering         
 8      1877 L3014               9 Gustav Cazar   Engineering         
 9      1311 L4149              11 Axel Calzas    Engineering         
10      7108 L6544              12 Hideki Cocina~ Security            
# ... with 34 more rows, and 1 more variable:
#   CurrentEmploymentTitle <chr>

Question 4

When compare the transaction records and gps information, we found that some vehicles were in the same geographical location at the same time. Such institution can be regarded as suspicious ones and to draw their
movement track to explore their relationship.

The first suspicious pair we find is card ID 33 and 7. The line graph and dot graph are dyrawn below. The two visited the Chostus Hotel several time at 13pm and visited restaurants at evening. And they even have some tracks in the same apartment. The owners of the cars are Elsa Orilla and Brand Tempestad seperately. And they have the same employment type and employment title.

gps_path_selected2 <- gps_path2 %>%
  filter(id %in% c(7,33))
tmap_mode("view")
tm_shape(bgmap) +
  tm_rgb(bgmap, r = 1,g = 2,b = 3,
       alpha = NA,
       saturation = 1,
       interpolate = TRUE,
       max.value = 255) +
  tm_shape(gps_path_selected2) +
  tm_lines(col = 'id', style = "fixed")

The second suspect group are Car ID 22, 30 and 15. There is a lot of overlap in the tracks of these cars. They often buy coffee in Brew’ve been Served and often have lunch in the nearby restaurants. The owners of the three cars are Nubarron Adra, Loreto Bodrogi and Felix Resumir. These three employees are all security guards. Therefore, we can assume that they have comparably close relationship.

gps_path_selected2 <- gps_path2 %>%
  filter(id %in% c(22,30,15))
tmap_mode("view")
tm_shape(bgmap) +
  tm_rgb(bgmap, r = 1,g = 2,b = 3,
       alpha = NA,
       saturation = 1,
       interpolate = TRUE,
       max.value = 255) +
  tm_shape(gps_path_selected2) +
  tm_lines(col = 'id', style = "fixed")

Question 5

We draw dot plot at 2am, 3am, 4am, and 5am when generally very few people are outside. 15, 21, 24 and 16 are suspicious car id. There are several people gathering in the Frydo’s Autosupply N More. Someone even drive to parks far from the downtown center. Therefore, there are some suspicious activity locations: Spetson Park, Taxiarchon Park and Frydo’s Autosupply N More.